AITopics | breast cancer

breast cancer

The Attribution Impossibility: No Feature Ranking Is Faithful, Stable, and Complete Under Collinearity

Caraker, Drake, Arnold, Bryan, Rhoads, David

arXiv.org Machine Learning, May-22-2026

No feature ranking can be simultaneously faithful, stable, and complete when features are collinear. For collinear pairs, ranking reduces to a coin flip. We prove this impossibility, quantify it for four model classes, resolve it via ensemble averaging (DASH), and machine-verify it with 305 Lean 4 theorems. We characterize the complete attribution design space: exactly two families of methods exist -- faithful-complete methods (unstable, with rankings that flip up to 50% of the time) and ensemble methods like DASH (stable, reporting ties for symmetric features) -- and no method lies outside this dichotomy. The impossibility is quantitative: the attribution ratio diverges as 1/(1-rho^2) for gradient boosting, is infinite for Lasso, and converges for random forests. DASH (Diversified Aggregation of SHAP) is provably Pareto-optimal among unbiased aggregations, achieving the Cramer-Rao variance bound with a tight ensemble size formula. In a survey of 77 public datasets, 68% exhibit attribution instability. Switching to conditional SHAP does not escape the impossibility when features have equal causal effects. The framework includes practical diagnostics -- a Z-test workflow and single-model screening tool -- and has direct consequences for fairness auditing: SHAP-based proxy discrimination audits are provably unreliable under collinearity. The design space theorem, diagnostics, and impossibility are mechanically verified in Lean 4 (305 theorems from 16 axioms, 0 sorry) -- to our knowledge, the first formally verified impossibility in explainable AI.
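The coin-flip behaviour for collinear pairs can be illustrated with a small simulation. The sketch below is a stand-in under stated assumptions, not the paper's method: it uses ordinary least-squares coefficients as a crude attribution score rather than SHAP, and a plain average over independent refits rather than DASH. The function names (`fit_two_feature_ols`, `simulate`, `ranking_experiment`) and all parameter values are hypothetical.

```python
import random

def fit_two_feature_ols(xs1, xs2, ys):
    """Least-squares coefficients for y ~ b1*x1 + b2*x2 (no intercept)."""
    s11 = sum(a * a for a in xs1)
    s22 = sum(b * b for b in xs2)
    s12 = sum(a * b for a, b in zip(xs1, xs2))
    s1y = sum(a * y for a, y in zip(xs1, ys))
    s2y = sum(b * y for b, y in zip(xs2, ys))
    det = s11 * s22 - s12 * s12  # determinant of the 2x2 normal equations
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2

def simulate(rho, n, rng):
    """Draw n samples with corr(x1, x2) = rho and equal true effects (1, 1)."""
    xs1, xs2, ys = [], [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x1 = z1
        x2 = rho * z1 + (1 - rho ** 2) ** 0.5 * z2
        xs1.append(x1)
        xs2.append(x2)
        ys.append(x1 + x2 + rng.gauss(0, 1))
    return xs1, xs2, ys

def ranking_experiment(rho=0.95, n=200, trials=200, seed=0):
    """Refit on independent datasets; report how often feature 1 ranks first."""
    rng = random.Random(seed)
    fits = [fit_two_feature_ols(*simulate(rho, n, rng)) for _ in range(trials)]
    first_wins = sum(b1 > b2 for b1, b2 in fits) / trials
    mean_b1 = sum(b1 for b1, _ in fits) / trials
    mean_b2 = sum(b2 for _, b2 in fits) / trials
    return first_wins, mean_b1, mean_b2

first_wins, mean_b1, mean_b2 = ranking_experiment()
print(f"feature 1 ranked first in {first_wins:.0%} of refits")
print(f"averaged scores: {mean_b1:.2f} vs {mean_b2:.2f}")
```

Because both features have the same true effect, the winner of any single refit is decided by noise, so the fraction should sit near one half, while the averaged scores should come out close to a tie.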

artificial intelligence, instability, machine learning, (13 more...)

doi: 10.5281/zenodo.19468379

2605.21492

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.45)

Industry:

Banking & Finance (0.67)
Health & Medicine > Therapeutic Area (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.86)

7721f1fea280e9ffae528dc78c732576-Supplemental-Conference.pdf

Neural Information Processing Systems, Feb-19-2026, 06:22:13 GMT

DNA demethylation is associated with malignant progression of lower-grade gliomas. Scientific Reports, 9(1):1-12, 2019.

artificial intelligence, dataset, machine learning, (18 more...)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.97)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Locally Interpretable Individualized Treatment Rules for Black-Box Decision Models

Charvadeh, Yasin Khadem, Panageas, Katherine S., Chen, Yuan

arXiv.org Machine Learning, Feb-13-2026

Existing methods typically rely on either interpretable but inflexible models or highly flexible black-box approaches that sacrifice interpretability; moreover, most impose a single global decision rule across patients. We introduce the Locally Interpretable Individualized Treatment Rule (LI-ITR) method, which combines flexible machine learning models to accurately learn complex treatment outcomes with locally interpretable approximations to construct subject-specific treatment rules. LI-ITR employs variational autoencoders to generate realistic local synthetic samples and learns individualized decision rules through a mixture of interpretable experts. Simulation studies show that LI-ITR accurately recovers true subject-specific local coefficients and optimal treatment strategies. An application to precision side-effect management in breast cancer illustrates the necessity of flexible predictive modeling and highlights the practical utility of LI-ITR in estimating optimal treatment rules while providing transparent, clinically interpretable explanations.

artificial intelligence, machine learning, treatment rule, (16 more...)

2602.1152

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

eab69250e98b1f9fc54e473cc7a69439-Paper-Conference.pdf

Neural Information Processing Systems, Feb-12-2026, 15:23:19 GMT

breast cancer, feature selection, selection, (12 more...)

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

16009ce3d8a6872d79f056c75618911d-Paper-Conference.pdf

Neural Information Processing Systems, Feb-8-2026, 10:46:08 GMT

Many important datasets contain samples that are missing one or more feature values. Maintaining the interpretability of machine learning models in the presence of such missing data is challenging. Singly or multiply imputing missing values complicates the model's mapping from features to labels. On the other hand, reasoning on indicator variables that represent missingness introduces a potentially large number of additional terms, sacrificing sparsity.

artificial intelligence, dataset, machine learning, (18 more...)

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Netherlands > South Holland > Leiden (0.04)

Genre: Research Report (0.68)

Industry:

Health & Medicine > Therapeutic Area (0.70)
Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

New study questions whether annual mammograms are necessary for most women

FOX News, Dec-19-2025, 14:00:49 GMT

Risk-based breast cancer screening matches individual genetics to mammogram frequency, showing safety compared to annual screening in a 28,000-woman study.

annual mammogram, lifestyle real estate tech science, screening, (7 more...)

Country:

South America > Venezuela (0.05)
North America > United States > South Carolina (0.05)
North America > United States > New Jersey (0.05)
(2 more...)

Genre:

Research Report > New Finding (0.96)
Research Report > Experimental Study (0.70)

Industry:

Health & Medicine > Diagnostic Medicine (1.00)
Government > Regional Government > North America Government > United States Government (0.48)
Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (0.42)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence (0.71)

How Ensemble Learning Balances Accuracy and Overfitting: A Bias-Variance Perspective on Tabular Data

Mohammad, Zubair Ahmed

arXiv.org Artificial Intelligence, Dec-8-2025

Tree-based ensemble methods consistently outperform single models on tabular classification tasks, yet the conditions under which ensembles provide clear advantages--and prevent overfitting despite using high-variance base learners--are not always well understood by practitioners. We study four real-world classification problems (Breast Cancer diagnosis, Heart Disease prediction, Pima Indians Diabetes, and Credit Card Fraud detection) comparing classical single models against nine ensemble methods using five-seed repeated stratified cross-validation with statistical significance testing. Our results reveal three distinct regimes: (i) On nearly linearly separable data (Breast Cancer), well-regularized linear models achieve 97% accuracy with <2% generalization gaps; ensembles match but do not substantially exceed this performance. We systematically quantify dataset complexity through linearity scores, feature correlation, class separability, and noise estimates, explaining why different data regimes favor different model families. Cross-validated train/test accuracy and generalization-gap plots provide simple visual diagnostics for practitioners to assess when ensemble complexity is warranted. Statistical testing confirms that ensemble gains are significant on nonlinear tasks (p < 0.01) but not on near-linear data (p > 0.15). The study provides actionable guidelines for ensemble model selection in high-stakes tabular applications, with full code and reproducible experiments publicly available.

A model that almost perfectly fits its training data can still fail badly on new cases. This gap between training performance and real-world behaviour is the essence of overfitting, and it is particularly problematic in domains such as medical diagnosis and financial fraud detection, where mistakes are costly: missed tumours delay treatment, and undetected fraud translates directly into monetary loss.
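The generalization-gap diagnostic described here (cross-validated train accuracy minus test accuracy) can be sketched in a few lines. This is a minimal illustration under assumptions, not the study's code: the data are synthetic two-class Gaussians rather than the paper's datasets, the folds are plain rather than stratified, and the two learners (a 1-nearest-neighbour rule as a high-variance model and a nearest-centroid rule as a low-variance one) are chosen only to make the gap visible. All function names are hypothetical.

```python
import random

def make_data(n, rng):
    """Two Gaussian classes in 2-D centred at (-1, -1) and (+1, +1)."""
    data = []
    for i in range(n):
        label = i % 2
        centre = 1.0 if label else -1.0
        data.append(((rng.gauss(centre, 1), rng.gauss(centre, 1)), label))
    rng.shuffle(data)
    return data

def sq_dist(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def fit_1nn(train):
    """High-variance learner: predict the label of the closest training point."""
    return lambda x: min(train, key=lambda row: sq_dist(row[0], x))[1]

def fit_centroid(train):
    """Low-variance learner: predict the class whose mean is closest."""
    centroids = {}
    for label in (0, 1):
        pts = [x for x, y in train if y == label]
        centroids[label] = (sum(p[0] for p in pts) / len(pts),
                            sum(p[1] for p in pts) / len(pts))
    return lambda x: min(centroids, key=lambda c: sq_dist(centroids[c], x))

def accuracy(predict, rows):
    return sum(predict(x) == y for x, y in rows) / len(rows)

def cv_gap(fit, data, k=5):
    """Return mean (train accuracy, test accuracy, gap) over k folds."""
    train_acc = test_acc = 0.0
    for fold in range(k):
        test = data[fold::k]
        train = [row for i, row in enumerate(data) if i % k != fold]
        predict = fit(train)
        train_acc += accuracy(predict, train) / k
        test_acc += accuracy(predict, test) / k
    return train_acc, test_acc, train_acc - test_acc

data = make_data(200, random.Random(0))
nn_result = cv_gap(fit_1nn, data)
centroid_result = cv_gap(fit_centroid, data)
for name, (tr, te, gap) in (("1-NN", nn_result), ("centroid", centroid_result)):
    print(f"{name}: train={tr:.3f} test={te:.3f} gap={gap:.3f}")
```

The 1-nearest-neighbour rule memorizes its training set, so its train accuracy is perfect and any test error shows up entirely as gap; the centroid rule should show a much smaller gap on this near-linear problem.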

accuracy, artificial intelligence, machine learning, (18 more...)

2512.05469

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (0.90)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.60)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Clinical-R1: Empowering Large Language Models for Faithful and Comprehensive Reasoning with Clinical Objective Relative Policy Optimization

Gu, Boyang, Zhou, Hongjian, Segal, Bradley Max, Wu, Jinge, Cao, Zeyu, Zhong, Hantao, Clifton, Lei, Liu, Fenglin, Clifton, David A.

arXiv.org Artificial Intelligence, Dec-4-2025

Recent advances in large language models (LLMs) have shown strong reasoning capabilities through large-scale pretraining and post-training reinforcement learning, demonstrated by DeepSeek-R1. However, current post-training methods, such as Grouped Relative Policy Optimization (GRPO), mainly reward correctness, which is not aligned with the multi-dimensional objectives required in high-stakes fields such as medicine, where reasoning must also be faithful and comprehensive. We introduce Clinical-Objective Relative Policy Optimization (CRPO), a scalable, multi-objective, verifiable reinforcement learning method designed to align LLM post-training with clinical reasoning principles. CRPO integrates rule-based and verifiable reward signals that jointly optimize accuracy, faithfulness, and comprehensiveness without relying on human annotation. To demonstrate its effectiveness, we train Clinical-R1-3B, a 3B-parameter model for clinical reasoning. The experiments on three benchmarks demonstrate that our CRPO substantially improves reasoning on truthfulness and completeness over standard GRPO while maintaining comfortable accuracy enhancements. This framework provides a scalable pathway to align LLM reasoning with clinical objectives, enabling safer and more collaborative AI systems for healthcare while also highlighting the potential of multi-objective, verifiable RL methods in post-training scaling of LLMs for medical domains.
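The difference between rewarding correctness alone and rewarding several objectives can be sketched with the group-relative advantage commonly used in GRPO, where each sampled response's reward is standardized against the other responses to the same prompt. The abstract does not say how CRPO combines its reward signals, so the weighted sum in `combined_reward`, its weights, and the example scores below are assumptions for illustration only.

```python
from statistics import mean, pstdev

def combined_reward(accuracy, faithfulness, comprehensiveness,
                    weights=(1.0, 1.0, 1.0)):
    """Hypothetical scalar reward: a weighted sum of three scores in [0, 1]."""
    w_acc, w_faith, w_comp = weights
    return w_acc * accuracy + w_faith * faithfulness + w_comp * comprehensiveness

def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within one group of responses to the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; a sample std is an equally valid choice
    return [(r - mu) / (sigma + eps) for r in rewards]

# Four sampled responses scored as (accuracy, faithfulness, comprehensiveness).
scores = [(1.0, 0.2, 0.3), (1.0, 0.9, 0.8), (0.0, 0.7, 0.6), (1.0, 0.5, 0.5)]

correctness_only = group_relative_advantages([s[0] for s in scores])
multi_objective = group_relative_advantages([combined_reward(*s) for s in scores])

print("correctness only:", [round(a, 2) for a in correctness_only])
print("multi-objective: ", [round(a, 2) for a in multi_objective])
```

Under the correctness-only reward the three correct responses receive identical advantages, so nothing distinguishes a faithful answer from an unfaithful one; the combined reward separates them.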

arxiv preprint arxiv, large language model, machine learning, (16 more...)

2512.00601

Country:

Asia (0.46)
Europe > United Kingdom (0.28)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Diagnostic Medicine (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.96)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Copula Based Fusion of Clinical and Genomic Machine Learning Risk Scores for Breast Cancer Risk Stratification

Aich, Agnideep, Hewage, Sameera, Murshed, Md Monzur

arXiv.org Machine Learning, Nov-25-2025

Clinical and genomic models are both used to predict breast cancer outcomes, but they are often combined using simple linear rules that do not account for how their risk scores relate, especially at the extremes. Using the METABRIC breast cancer cohort, we studied whether directly modeling the joint relationship between clinical and genomic machine learning risk scores could improve risk stratification for 5-year cancer-specific mortality. We created a binary 5-year cancer-death outcome and defined two sets of predictors: a clinical set (demographic, tumor, and treatment variables) and a genomic set (gene-expression $z$-scores). We trained several supervised classifiers, such as Random Forest and XGBoost, and used 5-fold cross-validated predicted probabilities as unbiased risk scores. These scores were converted to pseudo-observations on $(0,1)^2$ to fit Gaussian, Clayton, and Gumbel copulas. Clinical models showed good discrimination (AUC 0.783), while genomic models had moderate performance (AUC 0.681). The joint distribution was best captured by a Gaussian copula (bootstrap $p=0.997$), which suggests a symmetric, moderately strong positive relationship. When we grouped patients based on this relationship, Kaplan-Meier curves showed clear differences: patients who were high-risk in both clinical and genomic scores had much poorer survival than those high-risk in only one set. These results show that copula-based fusion works in real-world cohorts and that considering dependencies between scores can better identify patient subgroups with the worst prognosis.
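The step of converting two risk scores to pseudo-observations and summarizing their dependence with a Gaussian copula can be sketched as follows. This is a minimal illustration, assuming a rank-based transform `rank / (n + 1)` and a normal-scores correlation as a simple estimate of the Gaussian copula parameter; the paper's actual fitting procedure is not given in the abstract, and the function names and the five example scores are hypothetical.

```python
from statistics import NormalDist

def pseudo_observations(scores):
    """Map scores to ranks / (n + 1), which lie strictly inside (0, 1).

    Ties are broken by input order, which is adequate for continuous scores.
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (n + 1)
    return u

def gaussian_copula_rho(u, v):
    """Pearson correlation of the normal scores of two pseudo-observation lists."""
    inv = NormalDist().inv_cdf
    a = [inv(x) for x in u]
    b = [inv(x) for x in v]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

# Made-up cross-validated risk scores for five patients.
clinical = [0.10, 0.40, 0.35, 0.80, 0.70]
genomic = [0.20, 0.30, 0.50, 0.90, 0.60]

u = pseudo_observations(clinical)
v = pseudo_observations(genomic)
rho = gaussian_copula_rho(u, v)
print(f"estimated Gaussian copula correlation: {rho:.3f}")
```

Because the transform depends only on ranks, the estimate is unaffected by any monotone rescaling of either score, which is what lets the dependence be modelled separately from each model's calibration.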

copula, dependence, risk score, (15 more...)

2511.17605

Country:

North America > United States > New York (0.04)
North America > United States > Minnesota > Blue Earth County > Mankato (0.04)
North America > United States > Louisiana > Lafayette Parish > Lafayette (0.04)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (0.83)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

DPSCREEN: Dynamic Personalized Screening

Kartik Ahuja, William Zame, Mihaela van der Schaar

Neural Information Processing Systems, Nov-21-2025, 05:46:54 GMT

Screening is important for the diagnosis and treatment of a wide variety of diseases. A good screening policy should be personalized to the features of the patient and to the dynamic history of the patient (including the history of screening).

artificial intelligence, decision support system, machine learning, (17 more...)

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Massachusetts (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
(4 more...)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Diagnostic Medicine (1.00)
Health & Medicine > Therapeutic Area > Gastroenterology (0.94)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.94)
Information Technology > Decision Support Systems (0.68)
